TSAR Complete Workflow

TSAR Workflow

Pipeline analysis from raw data reading to graphic visualization

1. Load TSAR and Other Relevant Packages

Install the package using devtools. Loading dplyr and ggplot2 alongside TSAR is recommended for data manipulation and plotting.

library(devtools)
## Loading required package: usethis
devtools::install_local("/Users/candygao/Desktop/TSAR_0.1.0.tar.gz")
## 
## ── R CMD build ─────────────────────────────────────────────────────────────────
##   
   checking for file ‘/private/var/folders/xl/qmtpk7rs22xdzzmpmy3vgy0h0000gn/T/Rtmpbi35lg/remotes130d512ce11f/TSAR/DESCRIPTION’ ...
  
✔  checking for file ‘/private/var/folders/xl/qmtpk7rs22xdzzmpmy3vgy0h0000gn/T/Rtmpbi35lg/remotes130d512ce11f/TSAR/DESCRIPTION’
## 
  
─  preparing ‘TSAR’:
##    checking DESCRIPTION meta-information ...
  
✔  checking DESCRIPTION meta-information
## 
  
   checking vignette meta-information ...
  
✔  checking vignette meta-information
## 
  
─  checking for LF line-endings in source and make files and shell scripts
## 
  
─  checking for empty or unneeded directories
## 
  
─  building ‘TSAR_0.1.0.tar.gz’
## 
  
   
## 
library(TSAR)
## 
## Attaching package: 'TSAR'
## The following object is masked from 'package:graphics':
## 
##     screen
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)

2. Load Data

Read data in .txt or .csv format. Use read.delim() for tab-delimited files and read.csv() for comma-separated files. Other input formats are acceptable as long as the data are stored in a data frame with measurements of numeric type. Make sure extra lines, such as instrument headers or footers, are excluded (e.g. with skip = and nrows = ). To check for them, inspect the data with View(), open the data file in Excel beforehand, or manually delete the extra lines before reading. The package expects the variable names “Well.Position”, “Temperature”, “Fluorescence”, and “Normalized” by default. Consider renaming the columns of the data frame to match before proceeding to the next step.

raw_data <- read.csv(file = "/Users/candygao/Desktop/qpcrresult/experiment file/Vitamin_RawData_Thermal Shift_02_162.eds.csv", header = TRUE, nrows = 118176)
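
As a small illustration of the skip = and nrows = arguments mentioned above, the following sketch writes a toy export to a temporary file and reads back only the data table. The file contents, the number of metadata lines, and the object names are assumptions made for this example; a real instrument export will differ.

```r
# Hypothetical export: two metadata lines, the data table, and a footer line.
export_lines <- c(
  "Instrument: example",                     # metadata line to skip
  "Run: thermal shift",                      # metadata line to skip
  "Well.Position,Temperature,Fluorescence",  # header row of the table
  "A01,21.97,87464.91",
  "A01,22.03,87437.66",
  "A02,21.97,90112.40",                      # toy value, not from a real run
  "End of export"                            # trailing line to drop
)
path <- tempfile(fileext = ".csv")
writeLines(export_lines, path)

# skip drops the leading metadata; nrows stops reading before the footer.
raw_example <- read.csv(path, header = TRUE, skip = 2, nrows = 3)

str(raw_example)
```

If the footer were read as well, the numeric columns would typically be imported as text, which is why trimming the input matters before analysis.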

3. Data Pre-Processing

Select the data of an individual well for pre-analysis screening, e.g. well A01.

test <- raw_data %>%
  filter(Well.Position == "A01") # select data for one well by well ID

#section data by temperature to remove messy area
#test <- subset(test, test$Temperature>40 & test$Temperature <50) 

Run an example analysis on one well to screen for potential errors and to see whether the model can be improved.

#normalize fluorescence reading into scale between 0 and 1
test <- normalize(test, fluo = 5, selected = c("Well.Position", "Temperature", "Fluorescence", "Normalized")) 
head(test)
##   Well.Position Temperature Fluorescence Normalized
## 1           A01    21.97290     87464.91  0.4339496
## 2           A01    22.03227     87437.66  0.4337473
## 3           A01    22.09164     87410.72  0.4335473
## 4           A01    22.15101     87384.08  0.4333495
## 5           A01    22.21038     87357.69  0.4331535
## 6           A01    22.26976     87331.48  0.4329590
gammodel <- model_gam(test, x = test$Temperature, y = test$Normalized)
test <- model_fit(test, model = gammodel) #, smoothed="Fluorescence")
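
To make the rescaling step concrete, here is a minimal base R sketch of min-max scaling applied within each well. It assumes that normalize() behaves like a per-well min-max rescaling, which is consistent with the 0 to 1 range shown in the output but is not confirmed by the text; min_max() and the toy data frame are hypothetical names used only for illustration.

```r
# Min-max scaling: an assumed stand-in for the rescaling done by normalize().
min_max <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

# Toy readings for two wells (made-up numbers).
toy <- data.frame(
  Well.Position = rep(c("A01", "A02"), each = 4),
  Fluorescence  = c(100, 400, 250, 100, 10, 20, 40, 30),
  stringsAsFactors = FALSE
)

# ave() applies min_max() separately within each well.
toy$Normalized <- ave(toy$Fluorescence, toy$Well.Position, FUN = min_max)
toy
```

Scaling within each well rather than across the plate keeps curves with different absolute signal on the same 0 to 1 axis.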

Output the analysis result using ggplot to view the normalized data and the fitted model. Determine whether any noise needs to be removed (i.e. by subsetting to a temperature range). Determine which model is best (i.e. whether the current data are already smoothed and whether the fitted model suits them well). Determine whether the Tm estimation is proper. *The current model assumes derivative estimation of the Tm value.

tm_est(test)
## [1] 53.73642
view_model(test)
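
The idea behind derivative-based Tm estimation can be sketched in base R: take the first derivative of the curve with respect to temperature and report the temperature where it peaks. The example below uses a simulated, noise-free logistic curve and plain finite differences, so it is only an illustration of the principle and not the implementation behind tm_est(); all object names and the transition midpoint of 54.3 are assumptions.

```r
# Simulated melt curve: a logistic transition centred at 54.3 (arbitrary value).
temperature <- seq(20, 95, by = 0.5)
normalized  <- 1 / (1 + exp(-(temperature - 54.3) / 2))

# Finite-difference first derivative, evaluated at the interval midpoints.
slope     <- diff(normalized) / diff(temperature)
midpoints <- (head(temperature, -1) + tail(temperature, -1)) / 2

# Tm estimate: the temperature at which the curve rises fastest.
tm_hat <- midpoints[which.max(slope)]
tm_hat
```

On real readings the raw differences are noisy, which is why a smoothed fit such as the GAM above is usually differentiated instead of the raw signal.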

Screen all wells for curve shape on raw_data set and sift out corrupted data. This step is not required but may help remove data modeling errors.

myApp <- weed_raw(raw_data, checkrange= c("A","C","1","12"))
#shiny::runApp(myApp)

t_data <- remove_raw(raw_data, 
                     removelist = c('C04','B04','C03','C10','B11','C11','C02',
                                    'C12','C01','B06','C07','C09','C08','B05',
                                    'B07','C06','B12','C05','B08','B02','B01',
                                    'B03','B09','B10'))

t_data <- remove_raw(t_data, removerange = c("D","H","1","12"))
screen(t_data)

4. 96-well Analysis Application

TSAR package excels in mass analysis by propagating identical protocols to all 96 wells.

5. Intermediate Data Output

Read the analysis result using the read_tsar() function and view the head and tail to ensure that the appropriate output was produced. The output can also be saved locally in .csv or .txt format using the write_tsar() function. However, the pipeline to downstream analysis does not require the output to be saved locally.

##   Well.Position       tm
## 1           A01 53.49893
## 2           A02 54.33012
## 3           A03 53.79578
## 4           A04 54.68635
## 5           A05 54.27076
## 6           A06 55.04258
##    Well.Position       tm
## 7            A07 53.14270
## 8            A08 55.16132
## 9            A09 53.97390
## 10           A10 55.04258
## 11           A11 54.27076
## 12           A12 54.68635
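
Because the arguments of the package's own writer are not shown here, the sketch below uses base R write.csv() and read.csv() to illustrate saving a Tm table and checking that it survives the round trip. The three rows are copied from the output above; the object names and the temporary file path are assumptions for this example.

```r
# Tm values copied from the first rows of the output above.
tm_table <- data.frame(
  Well.Position = c("A01", "A02", "A03"),
  tm            = c(53.49893, 54.33012, 53.79578),
  stringsAsFactors = FALSE
)

# Write to a temporary file without the row-name column.
out_path <- tempfile(fileext = ".csv")
write.csv(tm_table, out_path, row.names = FALSE)

# Read the file back and compare the Tm column with the original.
tm_reloaded <- read.csv(out_path, stringsAsFactors = FALSE)
all.equal(tm_table$tm, tm_reloaded$tm)
```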

6. Complete Dataset with Ligand and Protein Information

For downstream analysis, data need to be mapped to a specific ligand and compound. Users may supply this information through the default Excel template included in the package or as a .txt or .csv table specifying Ligand and Compound by Well ID. Data with compound and ligand labels can also be stored locally by the same means as in the previous step. All data are kept even where the input is blank. If removal is needed, call na.omit().

## New names:
## • `` -> `...1`
## • `Protein` -> `Protein...2`
## • `Ligand` -> `Ligand...3`
## • `Protein` -> `Protein...4`
## • `Ligand` -> `Ligand...5`
## • `Protein` -> `Protein...6`
## • `Ligand` -> `Ligand...7`
## • `Protein` -> `Protein...8`
## • `Ligand` -> `Ligand...9`
## • `Protein` -> `Protein...10`
## • `Ligand` -> `Ligand...11`
## • `Protein` -> `Protein...12`
## • `Ligand` -> `Ligand...13`
## • `Protein` -> `Protein...14`
## • `Ligand` -> `Ligand...15`
## • `Protein` -> `Protein...16`
## • `Ligand` -> `Ligand...17`
## • `Protein` -> `Protein...18`
## • `Ligand` -> `Ligand...19`
## • `Protein` -> `Protein...20`
## • `Ligand` -> `Ligand...21`
## • `Protein` -> `Protein...22`
## • `Ligand` -> `Ligand...23`
## • `Protein` -> `Protein...24`
## • `Ligand` -> `Ligand...25`
##   Well.Position Temperature Fluorescence Normalized       tm Protein Ligand
## 1           A01    21.97290     87464.91  0.4339496 53.49893   CA FL   DMSO
## 2           A01    22.03227     87437.66  0.4337473 53.49893   CA FL   DMSO
## 3           A01    22.09164     87410.72  0.4335473 53.49893   CA FL   DMSO
## 4           A01    22.15101     87384.08  0.4333495 53.49893   CA FL   DMSO
## 5           A01    22.21038     87357.69  0.4331535 53.49893   CA FL   DMSO
## 6           A01    22.26976     87331.48  0.4329590 53.49893   CA FL   DMSO
##       Well.Position Temperature Fluorescence  Normalized       tm Protein
## 14767           A12    94.70245     15343.25 0.009985017 54.68635   CA FL
## 14768           A12    94.76181     15222.33 0.007978866 54.68635   CA FL
## 14769           A12    94.82118     15101.69 0.005977393 54.68635   CA FL
## 14770           A12    94.88055     14981.33 0.003980515 54.68635   CA FL
## 14771           A12    94.93993     14861.24 0.001988068 54.68635   CA FL
## 14772           A12    94.99930     14741.41 0.000000000 54.68635   CA FL
##           Ligand
## 14767 PyxINE HCl
## 14768 PyxINE HCl
## 14769 PyxINE HCl
## 14770 PyxINE HCl
## 14771 PyxINE HCl
## 14772 PyxINE HCl
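
The mapping by Well ID amounts to a left join: every analysed well is kept, and wells without an entry in the label table receive NA. The base R sketch below illustrates this behaviour and the effect of na.omit(). It is not the implementation of join_well_info(); the table names are hypothetical, and well A05 is deliberately left unlabelled here to show the blank case.

```r
# Tm results for three wells (values taken from the output above).
tm_table <- data.frame(
  Well.Position = c("A01", "A03", "A05"),
  tm            = c(53.49893, 53.79578, 54.27076),
  stringsAsFactors = FALSE
)

# Label table in long form; A05 is intentionally missing for illustration.
well_info <- data.frame(
  Well.Position = c("A01", "A03"),
  Protein       = c("CA FL", "CA FL"),
  Ligand        = c("DMSO", "CAI"),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps every analysed well, labelled or not.
labelled <- merge(tm_table, well_info, by = "Well.Position", all.x = TRUE)

# na.omit() drops the wells whose labels were left blank.
complete <- na.omit(labelled)
complete
```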

7. Merge Data across Biological Replicates

Repeat steps 2 through 6 on the replicate data set. Five function calls complete the whole analysis. If additional screening is desired, two further calls launch the interactive window, which allows selection of the curves to remove.

raw_data_rep <- read.csv(file = "/Users/candygao/Desktop/qpcrresult/experiment file/Vitamin_RawData_Thermal Shift_02_168.eds.csv", header = TRUE, nrows = 118176)
screen(raw_data_rep)

#remove blank wells and weed out corrupted curves
raw_data_rep <- remove_raw(raw_data_rep, removerange = c("B","H","1","12"))
myApp <- weed_raw(raw_data_rep) 
#shiny::runApp(myApp)
raw_data_rep <- remove_raw(raw_data_rep, removelist = 'A12')
screen(raw_data_rep)

analysis_rep <- gam_analysis(raw_data_rep , smoothed = T)
output_rep <- read_tsar(analysis_rep, code = 2)
norm_data_rep <- join_well_info("/Users/candygao/Desktop/qpcrresult/experiment file/0203Well Information.xlsx", output_rep, type = "by_template")
## New names:
## • `` -> `...1`
## • `Protein` -> `Protein...2`
## • `Ligand` -> `Ligand...3`
## • `Protein` -> `Protein...4`
## • `Ligand` -> `Ligand...5`
## • `Protein` -> `Protein...6`
## • `Ligand` -> `Ligand...7`
## • `Protein` -> `Protein...8`
## • `Ligand` -> `Ligand...9`
## • `Protein` -> `Protein...10`
## • `Ligand` -> `Ligand...11`
## • `Protein` -> `Protein...12`
## • `Ligand` -> `Ligand...13`
## • `Protein` -> `Protein...14`
## • `Ligand` -> `Ligand...15`
## • `Protein` -> `Protein...16`
## • `Ligand` -> `Ligand...17`
## • `Protein` -> `Protein...18`
## • `Ligand` -> `Ligand...19`
## • `Protein` -> `Protein...20`
## • `Ligand` -> `Ligand...21`
## • `Protein` -> `Protein...22`
## • `Ligand` -> `Ligand...23`
## • `Protein` -> `Protein...24`
## • `Ligand` -> `Ligand...25`
norm_data_rep <- na.omit(norm_data_rep)

Merge data by content. All data are marked with their source file name and experiment date.

norm_data <- na.omit(norm_data)
norm_data_rep <- na.omit(norm_data_rep)
Bigdata <- merge_norm(norm_data, norm_data_rep, 
                      "Vitamin_RawData_Thermal Shift_02_162.eds.csv", 
                      "Vitamin_RawData_Thermal Shift_02_168.eds.csv", 
                      "20230203", "20230209")
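
Conceptually, merging replicates means stacking the labelled tables while tagging each row with its origin. The sketch below shows one way to do this in base R, building identifiers in the same pattern as the IDs printed in the next section (well, protein, ligand, and date joined by underscores). It is an assumption-laden illustration, not the internals of merge_norm(); tag_run() and the column names ExperimentFileName and Experiment_Date are hypothetical.

```r
# Hypothetical helper: tag one run with its source file and date, build IDs.
tag_run <- function(df, file_name, exp_date) {
  df$ExperimentFileName <- file_name
  df$Experiment_Date    <- exp_date
  df$condition_ID <- paste(df$Protein, df$Ligand, sep = "_")
  df$well_ID      <- paste(df$Well.Position, df$condition_ID, exp_date, sep = "_")
  df
}

# One toy table of labels, reused for both runs in this example.
run <- data.frame(
  Well.Position = c("A01", "A03"),
  Protein       = c("CA FL", "CA FL"),
  Ligand        = c("DMSO", "CAI"),
  stringsAsFactors = FALSE
)

# Stack the two tagged runs into one table.
merged <- rbind(
  tag_run(run, "Vitamin_RawData_Thermal Shift_02_162.eds.csv", "20230203"),
  tag_run(run, "Vitamin_RawData_Thermal Shift_02_168.eds.csv", "20230209")
)
merged[, c("well_ID", "condition_ID", "Experiment_Date")]
```

Keeping the date inside the well identifier lets the same plate position from different experiments remain distinguishable after stacking.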

8. Tm Estimation Shift Visualization

Use condition_IDs() and well_IDs() to list the available conditions and wells, then select or remove the conditions to visualize. Visualize the Tm estimates by compound or ligand type as a box plot.

condition_IDs(Bigdata)
## [1] "CA FL_DMSO"       "CA FL_CAI"        "CA FL_BIOTIN"     "CA FL_4-ABA"     
## [5] "CA FL_=+-LA"      "CA FL_PyxINE HCl"
well_IDs(Bigdata)
##  [1] "A01_CA FL_DMSO_20230203"       "A02_CA FL_DMSO_20230203"      
##  [3] "A03_CA FL_CAI_20230203"        "A04_CA FL_CAI_20230203"       
##  [5] "A05_CA FL_BIOTIN_20230203"     "A06_CA FL_BIOTIN_20230203"    
##  [7] "A07_CA FL_4-ABA_20230203"      "A08_CA FL_4-ABA_20230203"     
##  [9] "A09_CA FL_=+-LA_20230203"      "A10_CA FL_=+-LA_20230203"     
## [11] "A11_CA FL_PyxINE HCl_20230203" "A12_CA FL_PyxINE HCl_20230203"
## [13] "A01_CA FL_DMSO_20230209"       "A02_CA FL_DMSO_20230209"      
## [15] "A03_CA FL_CAI_20230209"        "A04_CA FL_CAI_20230209"       
## [17] "A05_CA FL_BIOTIN_20230209"     "A06_CA FL_BIOTIN_20230209"    
## [19] "A07_CA FL_4-ABA_20230209"      "A08_CA FL_4-ABA_20230209"     
## [21] "A09_CA FL_=+-LA_20230209"      "A10_CA FL_=+-LA_20230209"     
## [23] "A11_CA FL_PyxINE HCl_20230209"
conclusion <- Bigdata%>%
  filter(condition_ID != "NA_NA") %>%
  filter(condition_ID != "CA FL_Riboflavin")
TSA_boxplot(conclusion, color_by = "Protein", label_by = "Ligand", separate_legend = FALSE)
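
The shift that the box plot displays can also be summarised numerically as the difference between each condition's mean Tm and that of a reference condition. The sketch below uses the Tm values of wells A01 to A06 and their ligand labels as printed earlier in this document, and treats DMSO as the reference, which is an assumption for this example. It is a base R illustration and not part of the TSAR API.

```r
# Per-well Tm values (wells A01 to A06) with their ligand labels.
tm_by_well <- data.frame(
  Ligand = rep(c("DMSO", "CAI", "BIOTIN"), each = 2),
  tm     = c(53.49893, 54.33012, 53.79578, 54.68635, 54.27076, 55.04258),
  stringsAsFactors = FALSE
)

# Mean Tm per ligand, then the shift relative to the assumed DMSO reference.
mean_tm  <- tapply(tm_by_well$tm, tm_by_well$Ligand, mean)
delta_tm <- mean_tm - mean_tm[["DMSO"]]

round(delta_tm, 2)
```

With only two wells per condition these differences are descriptive, so replicate experiments remain important before drawing conclusions.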

9. TSA Curve Visualization

Specify the control condition by assigning its condition_ID to control_ID. tsa_compare_plot() generates multiple line graphs for comparison.

control_ID <- "CA FL_DMSO"

tsa_compare_plot(conclusion,
                 y = "RFU",
                 control_condition = control_ID)
## $`CA FL_CAI`

## 
## $`CA FL_BIOTIN`

## 
## $`CA FL_4-ABA`

## 
## $`CA FL_=+-LA`

## 
## $`CA FL_PyxINE HCl`

## 
## $`Control: CA FL_DMSO`

10. Data Correction

test_outlier <- join_well_info("/Users/candygao/Desktop/qpcrresult/experiment file/0203Well Information.xlsx", read_tsar(x, code = 0), type = "by_template")
## New names:
## • `` -> `...1`
## • `Protein` -> `Protein...2`
## • `Ligand` -> `Ligand...3`
## • `Protein` -> `Protein...4`
## • `Ligand` -> `Ligand...5`
## • `Protein` -> `Protein...6`
## • `Ligand` -> `Ligand...7`
## • `Protein` -> `Protein...8`
## • `Ligand` -> `Ligand...9`
## • `Protein` -> `Protein...10`
## • `Ligand` -> `Ligand...11`
## • `Protein` -> `Protein...12`
## • `Ligand` -> `Ligand...13`
## • `Protein` -> `Protein...14`
## • `Ligand` -> `Ligand...15`
## • `Protein` -> `Protein...16`
## • `Ligand` -> `Ligand...17`
## • `Protein` -> `Protein...18`
## • `Ligand` -> `Ligand...19`
## • `Protein` -> `Protein...20`
## • `Ligand` -> `Ligand...21`
## • `Protein` -> `Protein...22`
## • `Ligand` -> `Ligand...23`
## • `Protein` -> `Protein...24`
## • `Ligand` -> `Ligand...25`
error <- conclusion %>% filter(condition_ID == 'CA FL_PyxINE HCl')
TSA_wells_plot(error, separate_legend = FALSE)
## Warning: Use of `tsa_data$Temperature` is discouraged.
## ℹ Use `Temperature` instead.
## Warning: Use of `tsa_data$Normalized` is discouraged.
## ℹ Use `Normalized` instead.